Comparative phonetic analysis and phoneme recognition for Afrikaans, English and Xhosa using the African Speech Technology telephone speech databases

Authors

  • Thomas Niesler
  • Philippa H. Louw
Abstract

This paper concerns the Afrikaans, English and Xhosa speech databases recently developed as part of the African Speech Technology project. The three corpora are analysed and compared in terms of their phonetic content, diversity and mutual overlap. Connected phoneme recognition systems are subsequently developed and tested in each language.

Similar resources

African speech technology (AST) telephone speech databases: corpus design and contents

The African Speech Technology project is developing telephone speech databases for five of South Africa’s eleven official languages, i.e. South African English, Afrikaans, and three African languages, Zulu, Xhosa, and Southern Sotho. These databases will be fully transcribed – orthographically and phonetically – and will be used for the training and testing of phoneme-based, speaker-independent...

Phonetic analysis of Afrikaans, English, Xhosa and Zulu using South African speech databases

We present a corpus-based analysis of the Afrikaans, English, Xhosa and Zulu languages, comparing these in terms of phonetic content, diversity and mutual overlap. Our aim is to shed light on the fundamental phonetic interrelationships between these languages, with a view to furthering progress in multilingual automatic speech recognition in general, and in the South African region in particular.

Language-dependent State Clustering for Multilingual Speech Recognition in Afrikaans, South African English, Xhosa and Zulu

The development of automatic speech recognition systems requires significant quantities of annotated acoustic data. In South Africa, the large number of spoken languages hampers such data collection efforts. Furthermore, code switching and mixing are commonplace since most citizens speak two or more languages fluently. As a result a considerable degree of phonetic cross pollination between lang...

The African Speech Technology Project: An Assessment

This paper reflects on the recently completed African Speech Technology (AST) Project. The AST Project successfully developed eleven annotated telephone speech databases for five languages spoken in South Africa i.e. Xhosa, Southern Sotho, Zulu, English and Afrikaans. These databases were used to train and test speech recognition systems applied in a multilingual telephone-based prototype hotel...

Language Identification and Multilingual Speech Recognition Using Discriminatively Trained Acoustic Models

We perform language identification experiments for four prominent South-African languages using a multilingual speech recognition system. Specifically, we show how successfully Afrikaans, English, Xhosa and Zulu may be identified using a single set of HMMs and a single recognition pass. We further demonstrate the effect of language identification-specific discriminative acoustic model training ...

Journal title:
  • South African Computer Journal

Volume 32

Publication date: 2004